<html>
<head>
<title>MDB XML Schema documentation</title>
</head>
<body>
<center><h1>MDB XML Schema documentation</h1></center>
<p><b>Author:</b> Manuel Lemos (<a href="mailto:mlemos@acm.org">mlemos@acm.org</a>)</p>
<p><b>Version control:</b> <tt>@(#) $Id: xml_schema_documentation.html,v 1.5 2002/08/26 13:30:59 lsmith Exp $</tt></p>
<h2>Contents</h2>
<li><a href="#218.2.0">Introduction to XML</a></li>
<li><a href="#218.2.1">Schema description format and XML</a></li>
<li><a href="#218.2.2">Schema description data structure</a></li>
<ul>
<li><a href="#219.3.0">database</a></li>
<li><a href="#220.3.1">table</a></li>
<li><a href="#230.3.2">declaration</a></li>
<li><a href="#230.3.3">field</a></li>
<li><a href="#233.3.4">index</a></li>
<li><a href="#234.3.5">index field</a></li>
<li><a href="#238.3.6">initialization</a></li>
<li><a href="#238.3.7">insert</a></li>
<li><a href="#238.3.8">sequence</a></li>
<li><a href="#239.3.9">sequence on table field</a></li>
<li><a href="#242.3.10">variable</a></li>
</ul>
<ul>
<h3><a name="218.2.0"><li>Introduction to <i>XML</i></li></a></h3>
<p>The schema description format is based on <i>XML</i> (<i>eXtensible Markup Language</i>).  For those that are not familiar with it, <i>XML</i> is standard that specifies rules to define abstract data formats based on tagged text.</p>
<p>Like <i>HTML</i> (<i>HyperText Markup Language</i>), <i>XML</i> is also based on <i>SGML</i> (<i>Standard Generalized Markup Language</i>).  <i>SGML</i> defines rules to structure data using special text tags.</p>
<p><i>SGML</i> tags may be used to delimit data sections.  Section begin tags are of the form <tt>&lt;name attributes... &gt;</tt> and end tags are of the form <tt>&lt;/name&gt;</tt>. <tt>name</tt> is the name of the tag and <tt>attributes</tt> is a set of zero or more pairs of attribute names and the values associated with the respective tag.</p>
<p><i>XML</i> is a little stricter in the way tags may be used. While with many <i>SGML</i> formats some end tags are optional, in <i>XML</i> end tags are always required.  Also, when tag attributes are used, attribute values must be specified always between quotes.  These <i>XML</i> requirements are usually known as <b>well-formedness</b>.</p>
<p>Another important detail about XML strictness is that tag names and attributes are case sensitive.  This means that tags in upper case are distinct from tags in lower case.</p>
<h3><a name="218.2.1"><li>Schema description format and <i>XML</i></li></a></h3>
<p>Unlike a common (but mistaken) belief, <i>XML</i> is not meant just for describing data in documents that are meant to be displayed or printed.  <i>XML</i> is a standard that defines rules for describing abstract data may be used to for any purpose.</p>
<p>Even though it may be used to schemas that may be displayed or printed, <b>Metabase</b> schema description format is meant to provide a way for developers to design their database schemas using a <i>DBMS</i> independent file format.  Using this format, developers may describe relations and properties of tables, field, indexes, sequences, etc..</p>
<p>This format uses just a subset of the <i>XML</i> known as <i>SML</i> (<i>Simplified Markup Language</i>).  <i>SML</i> formats complies with the same rules as any <i>XML</i> format but it does not use all its possibilities to make it simpler for users to write and understand the data.  For instance, files written on <i>SML</i> do not use any tag attributes.</p>
<p><b>Metabase</b> schema description format is also simpler to edit by hand because tags and constant data values should always be in lower case to save the user from holding frequently the keyboard shift key.</p>
<h3><a name="218.2.2"><li>Schema description data structure</li></a></h3>
<p>The <b>Metabase</b> schema description format lets the developers describe a set of database schema objects following the database objects hierarchy.  This means that a database may contain tables and sequence objects, tables may contain fields and index objects and all these objects have their own attributes.</p>
<p>The definition of each database schema object contained within the begin and end tags of the respective container object. Therefore, the definition of each table and sequence has to be specified between the main database begin and end tags. Likewise, the definition of fields and indexes has to be specified between the respective table begin and end tags.</p>
<p>The properties of each schema object are also defined between the respective begin and end tags.  The values of each property are also defined between the respective property being and end tags.</p>
<p>The values of the properties are subject of validation according to the type of each property and the context within which they are being defined.</p>
<p>Some properties define names of database schema objects.  There are names that are accepted as valid for some <i>DBMS</i> that are not accepted by other <i>DBMS</i>.  <b>Metabase</b> schema parser may optionally fail if such names are used to reduce the potential problems when using the same <b>Metabase</b> based application with different <i>DBMS</i>.</p>
<ul>
<p>The schema object description tags are defined as follows:</p>
<h4><a name="219.3.0"><li><tt>database</tt></li></a></h4>
<p>The <tt>database</tt> tag should be always at the top of the schema object hierarchy.  Currently it may contain the definition of two types of objects:  <tt>table</tt> and <tt>sequence</tt>.</p>
<p>The <tt>database</tt> schema object may have the following properties:</p>
<ul>
<p><li><tt>name</tt></li> (required)</p>
<p>Name of the database that is meant to be created when it is installed for the first time.</p>
<p><b>Default:</b> none</p>
<p><li><tt>create</tt></li></p>
<p>Boolean flag that indicates whether the database manager class should create the specified database or use a previously installed database of the same name.</p>
<p>This property may have to be set to <tt>0</tt> if you are splitting your database definition in multiple schema description files and only the installation of the first description file is supposed to lead to the actual database creation operation.</p>
<p>Another circumstance on which this property may have to be set to <tt>0</tt> is when the <i>DBMS</i> driver does not support database creation or if this operation requires special database administrator permissions that may not be available to the database user.</p>
<p><b>Default:</b> <tt>0</tt></p>
<p><li><tt>overwrite</tt></li></p>
<p>Boolean flag that indicates whether the database manager class should overwrite previously created structures.</p>
<p><b>Default:</b> <tt>0</tt></p>
<p><li><tt>description</tt></li></p>
<p>Free text property meant for describing the purpose of the database. This field is not processed by the schema parser. It may contain text or other XML well formed tags that may be useful for embedding additional data structures for documentation applications or other purposes.</p>
<p><b>Default:</b> none</p>
<p><li><tt>comments</tt></li></p>
<p>Additional database comments. This field is not processed by the schema parser. It may contain text or other XML well formed tags that may be useful for embedding additional data structures for documentation applications or other purposes.</p>
<p><b>Default:</b> none</p>
</ul>
<h4><a name="220.3.1"><li><tt>table</tt></li></a></h4>
<p>The <tt>table</tt> is one of the main database schema objects.  It may be used in a schema description multiple times, once per each table that is contained the database being described.</p>
<p>The <tt>table</tt> object definition may contain the <tt>declaration</tt> and <tt>initialization</tt> sections besides the properties.  The <tt>table</tt> schema object may have the following properties:</p>
<ul>
<p><li><tt>name</tt></li> (required)</p>
<p>Name of the table.</p>
<p><b>Default:</b> none</p>
<p><li><tt>was</tt></li></p>
<p>Name that the table had before the last time it was renamed. This property is only meant to be used when the database is being upgraded and the table is meant to be renamed.</p>
<p>Beware that if this property is not specified and the <tt>name</tt> is changed, that is intended as if it is meant to drop the table with the previous name and create a new table without keeping the data that was previously stored in the table.</p>
<p><b>Default:</b> none</p>
<p><li><tt>description</tt></li></p>
<p>Free text property meant for describing the purpose of the table. This field is not processed by the schema parser. It may contain text or other XML well formed tags that may be useful for embedding additional data structures for documentation applications or other purposes.</p>
<p><b>Default:</b> none</p>
<p><li><tt>comments</tt></li></p>
<p>Additional table comments. This field is not processed by the schema parser. It may contain text or other XML well formed tags that may be useful for embedding additional data structures for documentation applications or other purposes.</p>
<p><b>Default:</b> none</p>
</ul>
<p>Example:</p>
<ul>
<p><tt>&lt;table&gt;</tt></p>
<ul>
<p><tt>&lt;name&gt;users&lt;/name&gt;</tt></p>
<p><tt>&lt;declaration&gt;</tt></p>
<ul>
<p><tt>&lt;field&gt;</tt></p>
<ul>
<p><tt>&lt;name&gt;id&lt;/name&gt;</tt><br />
 <tt>&lt;type&gt;integer&lt;/type&gt;</tt><br />
 <tt>&lt;notnull&gt;1&lt;/notnull&gt;</tt><br />
 <tt>&lt;default&gt;0&lt;/default&gt;</tt></p>
</ul>
<p><tt>&lt;/field&gt;</tt><br />
 <tt>&lt;field&gt;</tt></p>
<ul>
<p><tt>&lt;name&gt;name&lt;/name&gt;</tt><br />
 <tt>&lt;type&gt;text&lt;/type&gt;</tt><br />
 <tt>&lt;length&gt;&lt;variable&gt;user_name_length&lt;/variable&gt;&lt;/length&gt;</tt></p>
</ul>
<p><tt>&lt;/field&gt;</tt></p>
</ul>
<p><tt>&lt;/declaration&gt;</tt><br />
 <tt>&lt;initialization&gt;</tt></p>
<ul>
<p><tt>&lt;insert&gt;</tt></p>
<ul>
<p><tt>&lt;field&gt;</tt></p>
<ul>
<p><tt>&lt;name&gt;id&lt;/name&gt;</tt><br />
 <tt>&lt;value&gt;1&lt;/value&gt;</tt></p>
</ul>
<p><tt>&lt;/field&gt;</tt><br />
 <tt>&lt;field&gt;</tt></p>
<ul>
<p><tt>&lt;name&gt;name&lt;/name&gt;</tt><br />
 <tt>&lt;value&gt;administrator&lt;/value&gt;</tt></p>
</ul>
<p><tt>&lt;/field&gt;</tt></p>
</ul>
<p><tt>&lt;/insert&gt;</tt></p>
</ul>
<p><tt>&lt;/initialization&gt;</tt></p>
</ul>
<p><tt>&lt;/table&gt;</tt></p>
</ul>
<h4><a name="230.3.2"><li><tt>declaration</tt></li></a></h4>
<p><tt>declaration</tt> is one of the sections that is part of the table definition.  This section is required because it must contain the definition of the table <tt>field</tt> and <tt>index</tt> objects.</p>
<h4><a name="230.3.3"><li><tt>field</tt></li></a></h4>
<p><tt>field</tt> is one of the types of table definition object. It should be specified within the table declaration section for each field that the table should contain.</p>
<p>The <tt>field</tt> schema object may have the following properties:</p>
<ul>
<p><li><tt>name</tt> (required)</li></p>
<p>Name of the field.</p>
<p><b>Default:</b> none</p>
<p><li><tt>was</tt></li></p>
<p>Name that the field had before the last time it was renamed. This property is only meant to be used when the database is being upgraded and the field is meant to be renamed.</p>
<p><b>Default:</b> none</p>
<p><li><tt>type</tt> (required)</li></p>
<p>Type of the field.  Valid type values are: <tt>integer</tt>, <tt>text</tt>, <tt>boolean</tt>, <tt>date</tt>, <tt>timestamp</tt>, <tt>time</tt>, <tt>float</tt>, <tt>decimal</tt>, <tt>clob</tt> and <tt>blob</tt>.</p>
<p><b>Default:</b> none</p>
<p><li><tt>default</tt></li></p>
<p>Default value for the field.  The allowed values depend on the type of the field.  If this property is not specified the default value is assumed to be <tt>NULL</tt>.  If there is a table index on the field being declared, the <tt>default</tt> value must be specified because indexes can not be created on table fields with <tt>NULL</tt> values.</p>
<p>Notice that some low end <i>DBMS</i> do not support specifying default values in the field definition.  Make sure that the you either specify all the values for each new row that is inserted in the tables or that the <i>DBMS</i> actually supports default values.</p>
<p>Only some <i>DBMS</i> support default values for large object fields. For this reason it is not allowed to specify default values on the definition of this type of field.</p>
<p><b>Default:</b> none</p>
<p><li><tt>notnull</tt></li></p>
<p>Boolean flag property that specifies whether the table field should be created with the constraint <tt>NOT NULL</tt>.  As it suggests, this implies that it is not allowed to set this field value with <tt>NULL</tt>.  The parser does not allow that an index is declared on a field that is not declared with the <tt>notnull</tt> flag property.</p>
<p><b>Default:</b> <tt>0</tt></p>
<p><li><tt>unsigned</tt></li></p>
<p>Boolean flag property that specifies whether an integer field should be declared as unsigned integer.  Note that some <i>DBMS</i> do not support unsigned integer fields.  In such case the <i>DBMS</i> driver will ignore this property but it issues a warning.</p>
<p><b>Default:</b> <tt>0</tt></p>
<p><li><tt>length</tt></li></p>
<p>Unsigned integer property that specifies the length of a text.  If this property is specified the text field may not have a longer length.  Text values shorter than the length are not padded.  If this property is not specified the length limit is determined by the <i>DBMS</i>.</p>
<p><b>Default:</b> none</p>
<p><li><tt>description</tt></li></p>
<p>Free text property meant for describing the purpose of the field. This field is not processed by the schema parser. It may contain text or other XML well formed tags that may be useful for embedding additional data structures for documentation applications or other purposes.</p>
<p><b>Default:</b> none</p>
<p><li><tt>comments</tt></li></p>
<p>Additional field comments. This field is not processed by the schema parser. It may contain text or other XML well formed tags that may be useful for embedding additional data structures for documentation applications or other purposes.</p>
<p><b>Default:</b> none</p>
</ul>
<p>Example:</p>
<ul>
<p><tt>&lt;field&gt;</tt></p>
<ul>
<p><tt>&lt;name&gt;id&lt;/name&gt;</tt><br />
 <tt>&lt;type&gt;integer&lt;/type&gt;</tt><br />
 <tt>&lt;notnull&gt;1&lt;/notnull&gt;</tt><br />
 <tt>&lt;default&gt;0&lt;/default&gt;</tt></p>
</ul>
<p><tt>&lt;/field&gt;</tt><br />
</p>
</ul>
<h4><a name="233.3.4"><li><tt>index</tt></li></a></h4>
<p><tt>index</tt> is another type of table definition object. It should also be specified within the table declaration section for each field that the table should contain.</p>
<p>The <tt>index</tt> schema object may have the following properties:</p>
<ul>
<p><li><tt>name</tt> (required)</li></p>
<p>Name of the index.</p>
<p>Only some <i>DBMS</i> support indexes on large object fields. For this reason it is not allowed to specify a large object field for indexes.</p>
<p><b>Default:</b> none</p>
<p><li><tt>was</tt></li></p>
<p>Name that the index had before the last time it was renamed. This property is only meant to be used when the database is being upgraded and the field is meant to be renamed.</p>
<p><b>Default:</b> none</p>
<p><li><tt>unique</tt></li></p>
<p>Boolean flag property that specifies whether the combination of the values of the index fields on each row of the table should unique.</p>
<p><b>Default:</b> <tt>0</tt></p>
</ul>
<h4><a name="234.3.5"><li>index <tt>field</tt></li></a></h4>
<p><tt>field</tt> is a section that is part of the table index declaration.  It should be used once per each field on which the index should be created.</p>
<p>The index <tt>field</tt> declaration may have the following properties:</p>
<ul>
<p><li><tt>name</tt> (required)</li></p>
<p>Name of the field on which the index should be created.</p>
<p><b>Default:</b> none</p>
<p><li><tt>sorting</tt></li></p>
<p>Type of field sorting that should be assumed when the index is created.  On <i>DBMS</i> that support index sorting, queries that search tables on the fields specified by the given index may execute faster if the specified sorting type is chosen adequately.  <i>DBMS</i> that do not support index sorting will ignore this property.</p>
<p>Valid sorting types are <tt>ascending</tt> and <tt>descending</tt>.</p>
<p><b>Default:</b> none</p>
</ul>
<p>Example:</p>
<ul>
<p><tt>&lt;index&gt;</tt></p>
<ul>
<p><tt>&lt;name&gt;users_index&lt;/name&gt;</tt><br />
 <tt>&lt;field&gt;</tt></p>
<ul>
<p><tt>&lt;name&gt;id&lt;/name&gt;</tt><br />
 <tt>&lt;sorting&gt;ascending&lt;/sorting&gt;</tt></p>
</ul>
<p><tt>&lt;/field&gt;</tt></p>
</ul>
<p><tt>&lt;/index&gt;</tt><br />
</p>
</ul>
<h4><a name="238.3.6"><li><tt>initialization</tt></li></a></h4>
<p><tt>initialization</tt> is another section that is part of the table definition.  This section is optional and may contain the definition of actions that should be executed when installing the database for the first time.  Currently the only action that is supported is <tt>insert</tt>.</p>
<h4><a name="238.3.7"><li><tt>insert</tt></li></a></h4>
<p><tt>insert</tt> is a table initialization command that specifies the values of the fields of rows that should be inserted in the respective table after the database is created for the first time.</p>
<p>The <tt>insert</tt> command definition only contains one more <tt>field</tt> sections with <tt>name</tt> and <tt>value</tt> pairs that indicate the values of the fields are explicitly initialized when the new row is inserted.  Missing fields are implicitly initialized by the <i>DBMS</i> with the respective default values.</p>
<h4><a name="238.3.8"><li><tt>sequence</tt></li></a></h4>
<p>The <tt>sequence</tt> is another main database schema object.  It may be used in a schema description multiple times, once per each sequence that is contained the database being described.</p>
<p>A <tt>sequence</tt> is an object that only contains an integer value.  A sequence value is automatically incremented after every time it is fetch the next sequence value.  The fetch and increment actions constitute an atomic operation.  This means that if two concurrent accesses to the database fetch the next sequence value, they will always different values.</p>
<p>Sequences are useful to generate integer numbers that may be used to store in fields that must contain unique numbers.</p>
<p>The <tt>sequence</tt> object definition may contain one <tt>on</tt> section besides the properties.  The <tt>sequence</tt> schema object may have the following properties:</p>
<ul>
<p><li><tt>name</tt></li> (required)</p>
<p>Name of the sequence.</p>
<p><b>Default:</b> none</p>
<p><li><tt>was</tt></li></p>
<p>Name that the sequence had before the last time it was renamed. This property is only meant to be used when the database is being upgraded and the sequence is meant to be renamed.</p>
<p><b>Default:</b> none</p>
<p><li><tt>start</tt></li></p>
<p>Integer property that specifies the value that the sequence will return when it is fetched the next sequence value.</p>
<p><b>Default:</b>1</p>
<p><li><tt>description</tt></li></p>
<p>Free text property meant for describing the purpose of the sequence. This field is not processed by the schema parser. It may contain text or other XML well formed tags that may be useful for embedding additional data structures for documentation applications or other purposes.</p>
<p><b>Default:</b> none</p>
<p><li><tt>comments</tt></li></p>
<p>Additional table comments. This field is not processed by the schema parser. It may contain text or other XML well formed tags that may be useful for embedding additional data structures for documentation applications or other purposes.</p>
<p><b>Default:</b> none</p>
</ul>
<h4><a name="239.3.9"><li>sequence <tt>on</tt> table field</li></a></h4>
<p>The sequence <tt>on</tt> <tt>table</tt> <tt>field</tt> section specifies a table field on which the sequence value should be synchronized.</p>
<p>This information is meant to be used only by the database manager class when it is added a sequence later after the database has been installed for the first time.  If the sequence <tt>on</tt> is specified, the database manager class will override the sequence <tt>start</tt> value with a value that is higher than the highest value in the specified field table.  Therefore, the specified field table type must be integer.</p>
<p>Example:</p>
<ul>
<p><tt>&lt;sequence&gt;</tt></p>
<ul>
<p><tt>&lt;name&gt;users_sequence&lt;/name&gt;</tt><br />
 <tt>&lt;start&gt;1&lt;/start&gt;</tt><br />
 <tt>&lt;on&gt;</tt></p>
<ul>
<p><tt>&lt;field&gt;id&lt;/field&gt;</tt><br />
 <tt>&lt;table&gt;users&lt;/table&gt;</tt></p>
</ul>
<p><tt>&lt;/on&gt;</tt></p>
</ul>
<p><tt>&lt;/sequence&gt;</tt><br />
</p>
</ul>
<h4><a name="242.3.10"><li><tt>variable</tt></li></a></h4>
<p>The <tt>variable</tt> tag is not meant to define any database schema object but rather a means to replace property values with variables defined at parsing time.</p>
<p>For instance, if you have several tables that store user names in text fields with the same length limit, instead of hard coding the length value in all field definitions, you may use a variable reference.  The parser will replace variable references found in property definitions by the text values of the respective values passed to the parser.</p>
<p>The parser will issue an error if the it is referenced a variable that is not defined.</p>
</ul>
</body>
</html>
